<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <title>Recordings</title>
    <!-- <link rel="shortcut icon" href="../images/favicon.ico" type="image/x-icon"> -->
    <link rel="stylesheet" href="../css/common/reset.css">
    <link rel="stylesheet" href="../css/detail.min.css">
</head>

<body>
    <!-- Page anchor -->
    <div class="fix_left"></div>
    <!-- Header -->
    <div class="header">
        <div class="w header_top">
            <div class="top clearfix">
                <div class="fl welcom"></div>
                <div class="fr phone"></div>
            </div>
            <div class="header_nav">
                <div class="logo"></div>
                <div class="tab">
                    <a class="tab_item" href="/index.html">Home</a>
                    <a class="tab_item" href="/pages/category.html">Curriculum Vitae</a>
                    <a class="tab_item active" href="#">Recordings</a>
                </div>
            </div>
        </div>
    </div>
    <!-- Content -->
    <section class="w mt66">
        <div class="container">
            <a href="../index.html">Xiaowei Xu</a> &gt; Recordings
        </div>
        <br>
        <h3 align="CENTER">Recordings of Xiaowei Xu</h3>
        <h5 align="CENTER">April 7, 2021</h5>

        <center><a target="_blank" href="cv_xiaoweixu.pdf">Download in PDF <img border="0" src="images/pdf.gif"></a></center>

        <br>
        <br>
    </section>
    <div class="w">
        <h3> Dec, 2020 </h3>

        <li><span class="label label-success">1 Dec, 2020</span><a href="https://arxiv.org/pdf/2003.07923.pdf">""
            </a> </li>
        <p> <span class="label label-default">extensive</span>
        </p>While many studies utilize transfer learning with unlabeled medical data for
        classification and diagnosis tasks, only a few adopt this approach for
        segmentation. This work investigates two strategies for combining knowledge from labeled and unlabeled
        data for segmentation.
        </p>The first approach uses a WTA-CAE to learn features from unlabeled images and uses those features
        to initialize the weights of an F-Net that is then trained to perform the segmentation task. This
        strategy can be likened to pretraining of the segmentation network with self-supervision and is
        hereafter denoted by S-TL.
        </p>The second approach simultaneously trains an F-Net for segmentation and a WTA-CAE for
        reconstruction, which corresponds to multi-task learning with a self-supervised task and is denoted by
        S-MTL.
        </p>Results show both training strategies are more effective with a large ratio of unlabeled to labeled
        training data. Multi-task learning performed slightly better, but tends to demand more computational
        resources than the transfer learning strategy.

        </p>In accordance with the original paper [4], the larger kernel produced better results for the liver
        segmentation task.
        </p><b>Larger patches resulted in higher DSC</b>.
        </p><b>A combined Dice - Cross Entropy (Dice-CE) loss is more favorable than Focal loss</b>.
        </p>The original paper increased the number of feature maps for coarser resolution levels, with [16,
        24, 32, 48] for resolutions 0 − 3. Here, the number of feature maps per level was kept constant, [32,
        32, 32, 32], to allow a uniform sparsity level for the autoencoders. Using a uniform number of feature
        maps slightly lowered the average DSC, but also slightly lessened the standard deviation. The drop in
        performance was regarded as negligible and all further experiments used a uniform number of feature
        maps for the sake of the other training strategies.
        </p>Spatial sparsity is achieved in the network by retaining only the highest activations per feature
        map in the bottleneck, creating a sparsity level that is proportional to the number of feature maps.
        Each minibatch activates different filters in the feature maps, which circumvents the dead filter
        problem. Sparsity is an essential form of regularization, since without it the network would learn
        useless delta functions which merely copy the input instead of extracting useful features.
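The winner-take-all constraint described above can be sketched in a few lines of numpy (an illustrative sketch of the spatial sparsity idea, not the paper's WTA-CAE code): keeping only the single largest activation per feature map leaves exactly one nonzero value per map, so the sparsity level is tied to the number of feature maps.

```python
import numpy as np

def wta_spatial_sparsity(activations):
    """Keep only the highest activation in each feature map, zeroing the rest.

    `activations` is assumed to be shaped (num_feature_maps, H, W); a minimal
    hypothetical sketch, not the paper's implementation.
    """
    sparse = np.zeros_like(activations)
    for i, fmap in enumerate(activations):
        # location of the winning (maximal) activation in this feature map
        winner = np.unravel_index(np.argmax(fmap), fmap.shape)
        sparse[i][winner] = fmap[winner]
    return sparse
```

With C feature maps the bottleneck then passes exactly C nonzero values to the decoder, which is the uniform sparsity level the authors aimed for when fixing the number of feature maps per level.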
        </p>Generally, the earlier layers of a CNN learn common, low level image features, whereas the later
        layers learn detailed, task specific features. For most applications of transfer learning, fine-tuning
        only the later pretrained layers is often considered sufficient for computer vision tasks. However, as
        shown by [37], layer-wise fine-tuning experiments are necessary in order to determine the optimal
        transfer procedure for a specific application.
        <p> <span class="label label-default">comment</span>
            <li><span class="label label-success">2 Dec, 2020</span><a href="https://arxiv.org/pdf/1805.11247.pdf">"MICROSCOPY
                    CELL SEGMENTATION VIA CONVOLUTIONAL LSTM NETWORKS
                    " </a> </li>
            <p> <span class="label label-default">extensive</span>
            </p>In order to exploit cell dynamics we propose a novel segmentation architecture which integrates
            Convolutional Long Short Term Memory (C-LSTM) with the U-Net.
        </p><b>For U-Net, adding C-LSTM to the encoder obtains higher accuracy than adding it to the decoder</b>.

        </p>U-Net and FCNs are limited by their inability to incorporate temporal information, which can
        facilitate segmentation of individual touching cells or of cells that are partially visible.
        </p>The C-LSTM has recently been used to address the analysis of both temporal image sequences, such
        as next frame prediction [16], and volumetric data sets [17, 18]. In [18] C-LSTM is applied in
        multiple directions for the segmentation of 3D data represented as a stack of 2D slices. Another
        approach for 3D brain structure segmentation is proposed in [17], where each slice is separately fed
        into a U-Net architecture, and only the output is then fed into bi-directional C-LSTMs.
        </p>We note that, unlike [17], which was designed and evaluated on 3D brain segmentation, the proposed
        novel architecture is an intertwined composition of the two concepts rather than a pipeline.
        <p> <span class="label label-default">comment</span> The improvement is relatively small. Dataset and
            code are published as
            https://github.com/arbellea/LSTM-UNet.git.

            <li><span class="label label-success">3 Dec, 2020</span><a href="https://arxiv.org/pdf/1802.06955.pdf">"Recurrent
                    Residual Convolutional Neural Network based on U-Net (R2U-Net) for Medical Image
                    Segmentation
                    " </a> </li>
            <p> <span class="label label-default">extensive</span>
            </p>We propose a Recurrent Convolutional Neural Network (RCNN) based on U-Net as well as a
            Recurrent Residual Convolutional Neural Network (RRCNN) based on U-Net models, which are named
            RU-Net and R2U-Net respectively. The proposed models utilize the power of U-Net, Residual Network,
            as well as RCNN

        </p>There are some limitations of medical image segmentation including data scarcity and class
        imbalance.
        </p>Different data transformation or augmentation techniques (data whitening, rotation, translation,
        and scaling) are applied to increase the number of labeled samples available [12, 13, 14].
        </p>In addition, <b>patch based approaches are used for solving class imbalance problems</b>.
        </p>The architecture for segmentation tasks generally requires almost double the number of network
        parameters compared to an architecture for classification tasks.
        </p>The U-Net model provides several advantages for segmentation tasks: first, this model allows for
        the use of global location and context at the same time. Second, it works with very few training
        samples and provides better performance for segmentation tasks [12]. Third, an end-to-end pipeline
        processes the entire image in the forward pass and directly produces segmentation maps. This ensures that
        U-Net preserves the full context of the input images, which is a major advantage when compared to
        patch-based segmentation approaches [12, 14].
        <p> <span class="label label-default">comment</span>

            <li><span class="label label-success">4 Dec, 2020</span><a href="https://arxiv.org/pdf/1608.04117.pdf">"The
                    Importance of Skip Connections in Biomedical Image Segmentation" </a> </li>
            <p> <span class="label label-default">extensive</span>
            </p><b>The recent results suggest that depth can act as a regularizer</b>.
        </p><b>Short skip connections appear to stabilize updates</b>.
        </p>The variant with both long and short skip connections is not only the one that performs best but
        also converges faster than without short skip connections.
        </p>When long skip connections are retained, at least the shallow parts of the model can be updated
        (see both sides of Figure 4(b)), as these connections provide shortcuts for gradient flow.
        </p>Batch normalization was observed to increase the maximal updatable depth of the network
        </p><b>Even randomly initialized weights can confer a surprisingly large portion of a model’s
            performance after training only the classifier</b>.
        </p><b>Although long skip connections provide a shortcut for gradient flow in shallow layers, they do
            not alleviate the vanishing gradient problem in deep networks</b>.

        <p> <span class="label label-default">comment</span>

            <li><span class="label label-success">5 Dec, 2020</span><a href="https://arxiv.org/pdf/1604.02677.pdf">"DCAN:
                    Deep Contour-Aware Networks for Accurate Gland Segmentation" </a> </li>
            <p> <span class="label label-default">extensive</span>
            </p>Based on the FCN, we push it further by harnessing multi-level contextual feature
            representations, which include different levels of contextual information.
        </p>Directly training a network with such a large depth may fall into a local minimum, so weighted
        auxiliary classifiers C1-C3 are added into the network to further strengthen the training process.
        </p>However, it’s still quite hard to separate touching glands by relying only on the likelihood
        of gland objects due to the essential ambiguity in touching regions. This is rooted in the downsampling
        path causing spatial information loss along with feature abstraction. The boundary information formed
        by epithelial cell nuclei provides good complementary cues for splitting objects.

        </p>When incorporated with multi-task regularization during the training, the discriminative capability
        of intermediate features can be further improved
        </p><b>The motivation behind this is that the downsampling path aims at extracting the high-level
            abstraction information, while the upsampling path predicts the score masks in a pixel-wise way.</b>

        <p> <span class="label label-default">comment</span>

            <li><span class="label label-success">6 Dec, 2020</span><a href="https://arxiv.org/pdf/1809.10486.pdf">"nnU-Net:
                    Self-adapting Framework for U-Net-Based Medical Image Segmentation" </a> </li>
            <p> <span class="label label-default">extensive</span>
            </p><b>We argue the strong case for taking away the superfluous bells and whistles of many proposed
                network designs and instead focusing on the remaining aspects that determine the performance and
                generalizability of a method.</b>
        </p>The Medical Segmentation Decathlon is intended to specifically address this issue: participants in
        this challenge are asked to create a segmentation algorithm that generalizes across 10 datasets
        corresponding to different entities.
        </p>These algorithms may dynamically adapt to the specifics of a particular dataset, but are only
        allowed to do so in a fully automatic manner.
        </p><b>We hypothesize that some of the architectural modifications presented recently are in part
            overfitted to specific problems or could suffer from imperfect validation that results from
            sub-optimal reimplementations of the state-of-the-art.</b>
        </p><b>We believe that the remaining interdependent choices regarding the exact architecture, pre-
            processing, training, inference and post-processing quite often cause the U-Net to underperform
            when used as a benchmark.</b>
        </p>These are steps where much of the nets’ performance can be gained or respectively lost:
        preprocessing (e.g. resampling and normalization), training (e.g. loss, optimizer setting and data
        augmentation), inference (e.g. patch-based strategy and ensembling across test-time augmentations and
        models) and a potential post-processing (e.g. enforcing single connected components if applicable).
        </p>All intensity values occurring within the segmentation masks of the training dataset are collected
        and the entire dataset is normalized by clipping to the [0.5, 99.5] percentiles of these intensity
        values, followed by a z-score normalization based on the mean and standard deviation of all collected
        intensity values.
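The normalization scheme described above can be sketched as follows (function name and shapes are illustrative assumptions; this is not the nnU-Net source code):

```python
import numpy as np

def normalize_dataset(volumes, masks):
    """Illustrative sketch of the scheme described above:
    1. collect all intensities inside the ground-truth segmentation masks,
    2. clip every volume to their [0.5, 99.5] percentiles,
    3. z-score with the mean/std of the collected intensities."""
    foreground = np.concatenate([v[m > 0].ravel() for v, m in zip(volumes, masks)])
    lo, hi = np.percentile(foreground, [0.5, 99.5])
    mean, std = foreground.mean(), foreground.std()
    return [(np.clip(v, lo, hi) - mean) / std for v in volumes]
```

Using percentiles of the pooled foreground intensities (rather than per-volume statistics) makes outlier voxels in any single scan harmless to the whole dataset's normalization.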
        </p>A connected component analysis of all ground truth segmentation labels is performed on the training
        data.

        </p>
        <p> <span class="label label-default">comment</span>

            <li><span class="label label-success">7 Dec, 2020</span><a href="https://arxiv.org/pdf/1606.04797.pdf">"V-Net:
                    Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation
                    " </a> </li>
            <p> <span class="label label-default">extensive</span>
            </p><b>Motivated by [16] and other works discouraging the use of max-pooling operations in CNNs,
                pooling layers have been replaced in our approach by convolutional ones.</b>
        </p>Replacing pooling operations with convolutional ones also results in networks that, depending on
        the specific implementation, can have a smaller memory footprint during training, due to the fact that
        no switches mapping the output of pooling layers back to their inputs are needed for back-propagation,
        and that can be better understood and analysed [19] by applying only de-convolutions instead of
        un-pooling operations.

        </p>We introduce a novel objective function, which we optimise during training, based on the Dice
        coefficient. In this way we can deal with situations where there is a strong imbalance between the
        number of foreground and background voxels.
        </p><b>Using Dice loss, we do not need to assign weights to samples of different classes to establish
            the right balance between foreground and background voxels.</b>
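The Dice objective follows directly from the definition (a minimal numpy sketch; V-Net computes it over 3D volumes and differentiates it inside the network):

```python
import numpy as np

def soft_dice_loss(pred, target, eps=1e-6):
    """1 - Dice coefficient between predicted foreground probabilities and
    binary ground truth. No per-class weighting is needed: the ratio form is
    what makes the loss insensitive to foreground/background imbalance."""
    intersection = np.sum(pred * target)
    return 1.0 - (2.0 * intersection + eps) / (np.sum(pred) + np.sum(target) + eps)
```

A perfect prediction gives a loss of 0 and a fully wrong one approaches 1, regardless of how few foreground voxels the volume contains.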
        <p> <span class="label label-default">comment</span>


            <li><span class="label label-success">8 Dec, 2020</span><a href="https://link.springer.com/chapter/10.1007/978-3-030-59710-8_1">"Attention,
                    Suggestion and Annotation: A Deep Active Learning Framework for Biomedical Image
                    Segmentation " </a> </li>
            <p> <span class="label label-default">extensive</span>
            </p>We propose a deep active learning framework that combines the attention gated fully
            convolutional network (ag-FCN) and the distribution discrepancy based active learning algorithm
            (dd-AL) to significantly reduce the annotation effort by iteratively annotating the most
            informative samples to train the ag-FCN for better segmentation performance.
        </p>In addition to outperforming the current best annotation suggestion algorithm [10] on biomedical
        image segmentation in terms of accuracy (Table 1), our framework is more efficient
        </p>
        <p> <span class="label label-default">comment</span> Actually I think this paper does not offer much
            improvement over the paper by Yang Lin.

            <li><span class="label label-success">9 Dec, 2020</span><a href="https://link.springer.com/chapter/10.1007/978-3-030-59710-8_12">"Feature
                    Preserving Smoothing Provides Simple and Effective Data Augmentation for Medical Image
                    Segmentation" </a> </li>
            <p> <span class="label label-default">extensive</span>
            </p>Recent work suggests that CNNs for image classification suffer from a bias towards texture, and
            that reducing it can increase the network’s accuracy.
        </p>We hypothesize that CNNs for medical image segmentation might suffer from a similar bias.

        </p>We propose to reduce it by augmenting the training data with feature preserving smoothing, which
        reduces noise and high-frequency textural features, while preserving semantically meaningful
        boundaries.
        </p>We mainly focus on Total Variation based denoising [20], which we consider to be a natural match
        for augmentation in image segmentation problems due to its piecewise constant, segmentation-like
        output.
        <p> <span class="label label-default">comment</span>


            <li><span class="label label-success">10 Dec, 2020</span><a href="https://link.springer.com/chapter/10.1007/978-3-030-59710-8_18">"Learning
                    to Segment When Experts Disagree " </a> </li>
            <p> <span class="label label-default">extensive</span>
            </p>The predictive performance of these algorithms depends on the quality of labels, especially in
            the medical image domain, where both the annotation cost and inter-observer variability are high.
        </p>In a typical annotation collection process, different clinical experts provide their estimates of
        the “true” segmentation labels under the influence of their levels of expertise and biases. Treating
        these noisy labels blindly as the ground truth can adversely affect the performance of supervised
        segmentation models.

        </p>In this work, we present a neural network architecture for jointly learning, from noisy
        observations alone, both the reliability of individual annotators and the true segmentation label
        distributions. The separation of the annotators’ characteristics and true segmentation label is
        achieved by encouraging the estimated annotators to be maximally unreliable while achieving high
        fidelity with the training data.
        </p>https://github.com/UCLBrain/MSLS.
        <p> <span class="label label-default">comment</span> Interesting work for multiple annotations for each
            sample.


            <li><span class="label label-success">11 Dec, 2020</span><a href="https://arxiv.org/abs/1911.08777">"High-Order
                    Attention Networks for Medical Image Segmentation" </a> </li>
            <p> <span class="label label-default">extensive</span>
            </p>In order to capture global context information, we propose High-order Attention (HA), a novel
            attention module with adaptive receptive fields and dynamic weights.
        </p>Ideally, the inner-class similarity S(a0, a1) is large and the inter-class similarity
        S(a, b) is close to zero. But many factors confuse CNNs into
        extracting low-quality features, such as the similarity between the organ and the background and the
        cross-domain input. Consequently, the inter-class similarity S(a, b)
        will be relatively large and cause each pixel to aggregate much context information from other
        classes. Thus we propose an attention propagation block to
        reduce the inter-class similarity S(a, b) and enlarge the inner-class similarity S(a0, a1).

        </p>Notably, the adaptive receptive field and dynamic weights are beneficial to medical objects
        with variant appearance, and capturing contexts at high order is robust to low-quality features.
        </p>
        <p> <span class="label label-default">comment</span>


            <li><span class="label label-success">12 Dec, 2020</span><a href="pdf">"Automatic Data Augmentation
                    for 3D Medical Image Segmentation" </a> </li>
            <p> <span class="label label-default">extensive</span>
            </p>Data augmentation is an effective and universal technique for improving generalization
            performance of deep neural networks.
        </p>It could enrich the diversity of training samples, which is essential in medical image segmentation
        tasks because 1) the scale of medical image datasets is typically smaller, which may increase the risk
        of overfitting; 2) the shape and modality of different objects such as organs or tumors are unique,
        thus requiring a customized data augmentation policy.

        </p>However, most data augmentation implementations are hand-crafted and suboptimal in medical image
        processing. To fully exploit the potential of data augmentation, we propose an efficient algorithm to
        automatically search for the optimal augmentation strategies.
        </p>We formulate the coupled optimization w.r.t. network weights and augmentation parameters into a
        differentiable form by means of stochastic relaxation. This formulation allows us to apply alternative
        gradient-based methods to solve it, i.e. stochastic natural gradient method with adaptive step-size.
        <p> <span class="label label-default">comment</span>


            <li><span class="label label-success">13 Dec, 2020</span><a href="pdf">"Dual-Teacher: Integrating
                    Intra-domain and Inter-domain Teachers
                    for Annotation-Efficient Cardiac Segmentation" </a> </li>
            <p> <span class="label label-default">extensive</span>
            </p>Medical image annotations are prohibitively time-consuming and expensive to obtain. To
            alleviate annotation scarcity, many approaches have been developed to efficiently utilize extra
            information, e.g., semi-supervised learning further exploring plentiful unlabeled data, domain
            adaptation including multi-modality learning and unsupervised domain adaptation resorting to the
            prior knowledge from additional modality.
        </p>In this paper, we aim to investigate the feasibility of simultaneously leveraging abundant
        unlabeled data and well-established cross-modality data for annotation-efficient medical image
        segmentation. To this end, we propose a novel semi-supervised domain adaptation approach, namely
        Dual-Teacher, where the student model not only learns from labeled target data (e.g., CT), but also
        explores unlabeled target data and labeled source data (e.g., MR) by two teacher models.

        </p>Specifically, the student model learns the knowledge of unlabeled target data from the intra-domain
        teacher by encouraging prediction consistency, as well as the shape priors embedded in labeled source
        data from the inter-domain teacher via knowledge distillation.
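The three-part objective implied here (supervision, consistency, distillation) can be sketched as a weighted sum. Everything below — shapes, the squared-error terms, the weights — is an illustrative assumption, not the paper's exact loss:

```python
import numpy as np

def dual_teacher_loss(student, intra_teacher, inter_teacher, label,
                      w_consistency=0.5, w_distill=0.5):
    """Hypothetical student loss: cross-entropy on labeled target data, plus
    consistency with the intra-domain teacher (unlabeled target data) and
    distillation from the inter-domain teacher (source-domain shape priors).
    All inputs are per-pixel foreground probabilities of the same shape."""
    p = np.clip(student, 1e-7, 1 - 1e-7)
    supervised = -np.mean(label * np.log(p) + (1 - label) * np.log(1 - p))
    consistency = np.mean((student - intra_teacher) ** 2)
    distillation = np.mean((student - inter_teacher) ** 2)
    return supervised + w_consistency * consistency + w_distill * distillation
```

The student is pulled toward the ground truth where labels exist and toward the two teachers' predictions everywhere else, which is what lets it exploit unlabeled and cross-modality data at once.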
        </p>
        <p> <span class="label label-default">comment</span>


            <li><span class="label label-success">14 Dec, 2020</span><a href="pdf">"Memory-Efficient Automatic
                    Kidney and Tumor Segmentation Based
                    on Non-local Context Guided 3D U-Net" </a> </li>
            <p> <span class="label label-default">extensive</span>
            </p>Different from the traditional 3D U-Net, we implement a lightweight 3D U-Net with depthwise
            separable convolution (DSC), which can not only avoid overfitting but also improve the
            generalization ability.
        </p>By encoding long-range pixel-wise dependencies in features and recalibrating the weight of
        channels, we also develop a non-local context guided mechanism to capture global context and fully
        utilize the long-range dependencies during feature selection.
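The memory saving from depthwise separable convolution is easy to quantify: one k×k×k depthwise filter per input channel plus a 1×1×1 pointwise mixing step replaces the full c_in×c_out×k³ weight tensor (a common DSC formulation; the paper's exact layer layout may differ).

```python
def conv3d_params(c_in, c_out, k):
    """Weights in a standard 3D convolution (bias omitted)."""
    return c_in * c_out * k ** 3

def separable_conv3d_params(c_in, c_out, k):
    """Weights in a depthwise separable 3D convolution: one k*k*k depthwise
    filter per input channel, then a 1x1x1 pointwise convolution."""
    return c_in * k ** 3 + c_in * c_out

# Example: a 64 -> 64 channel layer with 3x3x3 kernels
standard = conv3d_params(64, 64, 3)            # 110592 weights
separable = separable_conv3d_params(64, 64, 3) # 5824 weights
```

That is roughly a 19x reduction for this layer, which is where the memory efficiency of such a lightweight 3D U-Net comes from.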

        <p> <span class="label label-default">comment</span>


            <li><span class="label label-success">15 Dec, 2020</span><a href="https://link.springer.com/chapter/10.1007/978-3-030-59719-1_32">"Deep
                    Active Contour Network for Medical Image Segmentation" </a> </li>
            <p> <span class="label label-default">extensive</span>
            </p>Recently, convolutional neural networks (CNNs) have achieved tremendous success in this task;
            however, they perform poorly at recognizing precise object boundaries due to the information loss
            in the successive downsampling layers.
        </p>To overcome this problem, we integrate an active contour model (convexified Chan-Vese model) into
        the CNN structure (DenseUNet), forming a new framework called deep active contour network (DACN).
        Instead of manual setting, DACN applies a CNN backbone to learn the initialization and parameters of
        the active contour model (ACM) automatically.

        </p>
        <p> <span class="label label-default">comment</span>

            <li><span class="label label-success">15 Dec, 2020</span><a href="https://link.springer.com/chapter/10.1007/978-3-030-59719-1_59">"Graph
                    Reasoning and Shape Constraints for Cardiac Segmentation in Congenital Heart Defect" </a>
            </li>
            <p> <span class="label label-default">extensive</span>
            </p>We propose a CHD segmentation method based on graph reasoning and shape constraints in this
            study. Graph reasoning is used to capture global relations by projecting a set of features
            into the interaction space and then performing relational reasoning. After reasoning, relation-aware
            features are mapped back to the original feature map.
        </p><b>Code: https://github.com/liut969/CHD-Seg.</b>

        </p><span><img border="0" width="1200" src="dataset_chd_seg.png" /></span>

        <p> <span class="label label-default">comment</span> The paper used our code. The improvement is good,
            but still low compared with clinical usage.

            <li><span class="label label-success">16 Dec, 2020</span><a href="https://arxiv.org/abs/2001.07645">"SAUNet:
                    Shape Attentive U-Net for Interpretable Medical Image Segmentation" </a> </li>
            <p> <span class="label label-default">extensive</span>
            </p><b>Code: https://github.com/sunjesse/shape-attentive-unet</b>
        </p>More recently, there has been a shift to utilizing deep learning and fully convolutional neural
        networks (CNNs) to perform image segmentation, which has yielded state-of-the-art results in many
        public benchmark datasets.

        </p>Despite the progress of deep learning in medical image segmentation, standard CNNs are still not
        fully adopted in clinical settings as they lack robustness and interpretability. Shapes are generally
        more meaningful features than solely the textures of images, which are the features regular CNNs learn,
        causing a lack of robustness.
        </p>Thus, we present a new architecture called Shape Attentive U-Net (SAUNet) which focuses on model
        interpretability and robustness. The proposed architecture attempts to address these limitations by the
        use of a secondary shape stream that captures rich shape-dependent information in parallel with the
        regular texture stream. Furthermore, we suggest multi-resolution saliency maps can be learned using our
        dual-attention decoder module which allows for multi-level interpretability and mitigates the need for
        additional computations post hoc.
        <p> <span class="label label-default">comment</span> Good work.

            <li><span class="label label-success">17 Dec, 2020</span><a href="https://arxiv.org/pdf/2012.13364.pdf">"Spatio-temporal
                    Multi-task Learning for Cardiac
                    MRI Left Ventricle Quantification" </a> </li>
            <p> <span class="label label-default">extensive</span>
            </p>In this paper,
            we propose a spatio-temporal multi-task learning approach to
            obtain a complete set of measurements quantifying cardiac LV
            morphology, regional-wall thickness (RWT), and additionally
            detecting the cardiac phase cycle (systole and diastole) for a
            given 3D Cine-magnetic resonance (MR) image sequence.
        </p>We
        first segment cardiac LVs using an encoder-decoder network
        and then introduce a multitask framework to regress 11 LV
        indices and classify the cardiac phase, as parallel tasks during
        model optimization. The proposed deep learning model is based
        on the 3D spatio-temporal convolutions, which extract spatial
        and temporal features from MR images.
        </p>All cardiac images were preprocessed by the challenge
        organizer, including landmark labeling to find the ROI, rotation
        to align the volumes, ROI cropping, and resizing. The resulting
        images are 80 × 80 in size. The LVQuan 2018 dataset
        images vary a lot in terms of contrast and brightness.

        </p><span><img border="0" width="1200" src="cardiac_mri.png" /></span>
        </p>The code and model are available at: https://github.com/sulaimanvesal/CardiacQuanNet.
        <p> <span class="label label-default">comment</span>

            <li><span class="label label-success">18 Dec, 2020</span><a href="https://arxiv.org/pdf/2003.04052.pdf">"On
                    the Texture Bias for
                    Few-Shot CNN Segmentation" </a> </li>
            <p> <span class="label label-default">extensive</span>
            </p>Despite the initial belief that Convolutional Neural Networks
            (CNNs) are driven by shapes to perform visual recognition tasks, recent
            evidence suggests that texture bias in CNNs provides higher performing
            models when learning on large labeled training datasets. This contrasts
            with the perceptual bias in the human visual cortex, which has a stronger
            preference towards shape components.
        </p>Perceptual differences may explain why CNNs achieve human-level performance when large labeled
        datasets are available, but their performance significantly degrades in
        low-labeled data scenarios, such as few-shot semantic segmentation.

        </p><span><img border="0" width="1200" src="texture_segmentation.png" /></span>

        </p>To
        remove the texture bias in the context of few-shot learning, we propose a
        novel architecture that integrates a set of Difference of Gaussians (DoG)
        to attenuate high-frequency local components in the feature space. This
        produces a set of modified feature maps, whose high-frequency components are diminished at different
        standard deviation values of the Gaussian distribution in the spatial domain.
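The Difference-of-Gaussians idea can be illustrated in 1-D (numpy only; the paper applies DoG banks per feature map at several standard deviations): subtracting a wider Gaussian blur from a narrower one yields a band-pass filter that strongly attenuates the highest-frequency components.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel with a 3-sigma radius."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    return k / k.sum()

def difference_of_gaussians(signal, sigma_narrow=1.0, sigma_wide=2.0):
    """Band-pass the signal: blur(narrow sigma) - blur(wide sigma)."""
    narrow = np.convolve(signal, gaussian_kernel(sigma_narrow), mode="same")
    wide = np.convolve(signal, gaussian_kernel(sigma_wide), mode="same")
    return narrow - wide
```

High-frequency texture (an alternating pattern) is suppressed almost entirely, while a semantically meaningful step edge still produces a clear response — the behavior the architecture relies on to reduce texture bias.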
        <p> <span class="label label-default">comment</span>


            <li><span class="label label-success">19 Dec, 2020</span><a href="pdf">"EPSNet: Efficient Panoptic
                    Segmentation
                    Network with Cross-layer Attention Fusion" </a> </li>
            <p> <span class="label label-default">extensive</span>
            </p>Panoptic segmentation is a scene parsing task that unifies
            semantic segmentation and instance segmentation into a single task.
            However, current state-of-the-art studies pay little attention to inference time.
        </p>In this work, we propose an Efficient Panoptic Segmentation Network (EPSNet) to tackle panoptic
        segmentation with fast inference speed. EPSNet generates masks
        as a simple linear combination of prototype masks and mask coefficients.
        </p><span><img border="0" width="1200" src="fast_segmentation.png" /></span>

        </p>The light-weight network branches for instance segmentation
        and semantic segmentation only need to predict mask coefficients and
        produce masks with the shared prototypes predicted by the prototype network branch. Furthermore, to enhance
        the quality of the shared prototypes,
        we adopt a "cross-layer attention fusion module", which
        aggregates multi-scale features with an attention mechanism, helping
        them capture long-range dependencies between each other.
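The mask-assembly step (a linear combination of shared prototypes and per-instance coefficients) can be sketched as follows; `assemble_masks` and the tensor shapes are hypothetical, not taken from the EPSNet code:

```python
import numpy as np

def assemble_masks(prototypes, coefficients):
    """Assemble instance masks from shared prototypes (EPSNet-style sketch).

    prototypes:   (k, H, W) prototype masks shared by all branches.
    coefficients: (n, k) predicted mask coefficients, one row per instance.
    Returns (n, H, W) soft instance masks.
    """
    k, H, W = prototypes.shape
    # Each mask is a simple linear combination of the k prototypes ...
    linear = coefficients @ prototypes.reshape(k, H * W)   # (n, H*W)
    # ... squashed to (0, 1) with a sigmoid.
    return (1.0 / (1.0 + np.exp(-linear))).reshape(-1, H, W)
```

Because the prototypes are shared, each branch only has to predict a small coefficient vector per instance, which is what makes the inference cheap.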
        <p> <span class="label label-default">comment</span> Code
            is available at: github.com/neo85824/epsnet.



            <li><span class="label label-success">20 Dec, 2020</span><a href="https://arxiv.org/pdf/2012.12425.pdf">"RAP-Net:
                    Coarse-to-Fine Multi-Organ Segmentation with Single Random Anatomical Prior
                    " </a> </li>
            <p> <span class="label label-default">extensive</span>
            </p>Performing coarse-to-fine abdominal multi-organ
            segmentation makes it possible to extract high-resolution
            segmentations while minimizing the loss of spatial contextual
            information. However, current coarse-to-fine approaches
            require a significant number of models to perform single-organ
            refined segmentation for each extracted
            organ region of interest (ROI).
        </p>We propose a coarse-to-fine
        pipeline, which starts by extracting the global prior
        context of multiple organs from 3D volumes using a low-resolution coarse network, followed by a fine
        phase that uses
        a single refined model to segment all abdominal organs
        instead of one model per organ.
        </p><span><img border="0" width="1200" src="high_resolution_seg.png" /></span>

        <p> <span class="label label-default">comment</span> The code is available at
            https://github.com/MASILab/coarse_to_fine_prior_seg.


            <li><span class="label label-success">21 Dec, 2020</span><a href="https://arxiv.org/pdf/2002.05773.pdf">"ACEnet:
                    Anatomical Context-Encoding Network for Neuroanatomy Segmentation" </a> </li>
            <p> <span class="label label-default">extensive</span>
            </p>Since 3D deep learning models suffer from high computational cost, 2D deep
            learning methods are favored for their computational efficiency. However, existing 2D
            methods ignore the intrinsic 3D spatial contextual information of brain MR images, which is
            needed to achieve accurate brain structure segmentation and could improve
            performance if properly utilized.
        </p><span><img border="0" width="1200" src="acenet.png" /></span>

        </p>The network backbone is a U-Net (Ronneberger et al., 2015) with 4 densely connected blocks for
        both the encoder
        and the decoder. To fuse both spatial-wise and channel-wise information within local receptive fields,
        spatial and channel Squeeze-and-Excitation (sc-SE) (Roy et al., 2018) is applied to each encoder,
        bottleneck, and
        decoder dense block.
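A minimal NumPy sketch of an sc-SE block may help; the weight shapes, reduction ratio, and additive fusion are illustrative assumptions, not ACEnet's exact configuration:

```python
import numpy as np

def sc_se(x, w1, w2, w_sp):
    """Concurrent spatial and channel Squeeze-and-Excitation (sketch).

    x:    (C, H, W) feature map from a dense block.
    w1:   (C // r, C) and w2: (C, C // r) weights of the channel-SE MLP.
    w_sp: (C,) weights of the 1x1 conv used for spatial SE.
    Weight shapes and the additive fusion are illustrative choices.
    """
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    # Channel SE: global average pool, bottleneck MLP, per-channel gate.
    z = x.mean(axis=(1, 2))                          # (C,)
    ch_gate = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))  # (C,)
    cse = x * ch_gate[:, None, None]
    # Spatial SE: 1x1 convolution over channels gives a per-pixel gate.
    sp_gate = sigmoid(np.tensordot(w_sp, x, axes=1)) # (H, W)
    sse = x * sp_gate[None, :, :]
    # Fuse the two recalibrated maps element-wise.
    return cse + sse
```

The channel branch recalibrates "what" is informative and the spatial branch recalibrates "where", which is why applying it per dense block fuses both kinds of information.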
        </p>Spatial Context Encoding Module: The consecutive image slices are regarded
        as a stack of 2D images with dimensions H × W × C, where H and W are the spatial dimensions of the
        2D image slices and C is the number of 2D image slices, rather than as a 3D volume with
        dimensions H × W × C × 1.
        Therefore, the input to the spatial context encoding module has the same dimensions as the 2D input.
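The slice-as-channel idea above amounts to a simple reshaping of the volume; `slice_stack` is a hypothetical helper written to illustrate it, not part of the ACEnet code:

```python
import numpy as np

def slice_stack(volume, center, num_slices=3):
    """Build a 2D multi-channel input from a 3D MR volume (sketch).

    volume: (D, H, W) brain MR volume.
    Returns an H x W x C array: C consecutive slices centred on `center`,
    treated as channels of one 2D image rather than an H x W x C x 1 volume.
    """
    half = num_slices // 2
    # Clamp at the volume boundary so edge slices still get C channels.
    idx = np.clip(np.arange(center - half, center + half + 1),
                  0, volume.shape[0] - 1)
    return np.stack([volume[i] for i in idx], axis=-1)
```

A 2D network can then consume this H × W × C stack directly, which is how 3D context enters at 2D cost.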
        </p>Anatomical Context Encoding Module: In particular, the detection of the
        presence of specific brain structures is formulated as a classification problem with an anatomical
        context encoding
        loss (ACE-loss) to optimize the network under direct supervision.
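The presence-classification formulation can be sketched as a per-class binary cross-entropy; the function `ace_loss` and its exact target construction are assumptions for illustration, not ACEnet's implementation:

```python
import numpy as np

def ace_loss(logits, seg_labels, num_classes):
    """Anatomical context encoding loss (sketch, names hypothetical).

    logits:     (num_classes,) raw presence scores from the encoding module.
    seg_labels: (H, W) integer ground-truth segmentation of the slice.
    Presence of each structure is a binary classification target:
    class c is "present" if any pixel in the slice carries label c.
    """
    present = np.array([(seg_labels == c).any()
                        for c in range(num_classes)], dtype=float)
    probs = 1.0 / (1.0 + np.exp(-logits))
    eps = 1e-7  # guard the logs against 0 and 1
    # Binary cross-entropy between predicted and true presence, per class.
    return float(-np.mean(present * np.log(probs + eps)
                          + (1.0 - present) * np.log(1.0 - probs + eps)))
```

Because the target is derived from the segmentation ground truth itself, the classifier gets direct supervision without any extra annotation.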
        <p> <span class="label label-default">comment</span>
            https://github.com/ymli39/ACEnet-for-Neuroanatomy-Segmentation.


    </div>
    <footer class="footer_desc">
        ©2015-2021 Jax Wong and Ken Lee. All Rights Reserved.
    </footer>
</body>
<script data-main="../js/modules/category.min.js" src="../js/lib/require.js"></script>

</html>